215 research outputs found

Defining and Evaluating Network Communities based on Ground-truth

Author: Leskovec Jure
Yang Jaewon
Publication date: 06/11/2012

Nodes in real-world networks organize into densely linked communities where edges appear with high concentration among the members of the community. Identifying such communities of nodes has proven to be a challenging task, mainly due to a plethora of definitions of a community, intractability of algorithms, issues with evaluation, and the lack of a reliable gold-standard ground-truth. In this paper we study a set of 230 large real-world social, collaboration and information networks where nodes explicitly state their group memberships. For example, in social networks nodes explicitly join various interest-based social groups. We use such groups to define a reliable and robust notion of ground-truth communities. We then propose a methodology which allows us to compare and quantitatively evaluate how different structural definitions of network communities correspond to ground-truth communities. We choose 13 commonly used structural definitions of network communities and examine their sensitivity, robustness and performance in identifying the ground-truth. We show that the 13 structural definitions are heavily correlated and naturally group into four classes. We find that two of these definitions, Conductance and Triad-participation-ratio, consistently give the best performance in identifying ground-truth communities. We also investigate the task of detecting communities given a single seed node. We extend the local spectral clustering algorithm into a heuristic parameter-free community detection method that easily scales to networks with more than a hundred million nodes. The proposed method achieves a 30% relative improvement over current local clustering methods.

Comment: Proceedings of 2012 IEEE International Conference on Data Mining (ICDM), 201
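The abstract names Conductance and Triad-participation-ratio without defining them. The sketch below uses the commonly cited definitions, which are an assumption here rather than something stated in this listing: conductance is the number of boundary edges divided by the total number of edge endpoints attached to the community, and the triad participation ratio is the fraction of community members that sit in at least one triangle made entirely of community members. The function names and the toy graph are hypothetical.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from edge pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def conductance(adj, community):
    """Boundary edges divided by the community's volume (assumed definition).

    Volume = 2 * (internal edges) + (boundary edges). Lower values mean the
    community is better separated from the rest of the network.
    """
    members = set(community)
    internal_endpoints = 0  # each internal edge is counted once per endpoint
    boundary = 0
    for u in members:
        for v in adj.get(u, ()):
            if v in members:
                internal_endpoints += 1
            else:
                boundary += 1
    volume = internal_endpoints + boundary
    return boundary / volume if volume else 0.0


def triad_participation_ratio(adj, community):
    """Fraction of members that belong to a triangle inside the community."""
    members = set(community)
    in_triad = 0
    for u in members:
        inside_neighbors = adj.get(u, set()) & members
        # u is in a triad if any two of its in-community neighbors are linked.
        if any(b in adj[a] for a, b in combinations(inside_neighbors, 2)):
            in_triad += 1
    return in_triad / len(members) if members else 0.0


# Toy graph: a triangle 1-2-3 with a tail 3-4-5 attached.
toy_adj = build_adjacency([(1, 2), (2, 3), (1, 3), (3, 4), (4, 5)])
triangle_conductance = conductance(toy_adj, {1, 2, 3})
triangle_tpr = triad_participation_ratio(toy_adj, {1, 2, 3})
```

On this toy graph the triangle {1, 2, 3} has one boundary edge against seven edge endpoints, so a low conductance and a triad participation ratio of 1, while the tail {3, 4, 5} contains no internal triangle.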

arXiv.org e-Print Archive

CiteSeerX

Community Detection in Networks with Node Attributes

Author: Leskovec Jure
McAuley Julian
Yang Jaewon
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2013

Community detection algorithms are fundamental tools that allow us to uncover organizational principles in networks. When detecting communities, there are two possible sources of information one can use: the network structure, and the features and attributes of nodes. Even though communities form around nodes that have common edges and common attributes, typically, algorithms have only focused on one of these two data modalities: community detection algorithms traditionally focus only on the network structure, while clustering algorithms mostly consider only node attributes. In this paper, we develop Communities from Edge Structure and Node Attributes (CESNA), an accurate and scalable algorithm for detecting overlapping communities in networks with node attributes. CESNA statistically models the interaction between the network structure and the node attributes, which leads to more accurate community detection as well as improved robustness in the presence of noise in the network structure. CESNA has a linear runtime in the network size and is able to process networks an order of magnitude larger than comparable approaches. Last, CESNA also helps with the interpretation of detected communities by finding relevant node attributes for each community.

Comment: Published in the proceedings of IEEE ICDM '1

arXiv.org e-Print Archive

CiteSeerX

Crossref

Detecting Cohesive and 2-mode Communities in Directed and Undirected Networks

Author: Leskovec Jure
McAuley Julian
Yang Jaewon
Publication date: 28/01/2014

Networks are a general language for representing relational information among objects. An effective way to model, reason about, and summarize networks is to discover sets of nodes with common connectivity patterns. Such sets are commonly referred to as network communities. Research on network community detection has predominantly focused on identifying communities of densely connected nodes in undirected networks. In this paper we develop a novel overlapping community detection method that scales to networks of millions of nodes and edges and advances research along two dimensions: the connectivity structure of communities, and the use of edge directedness for community detection. First, we extend traditional definitions of network communities by building on the observation that nodes can be densely interlinked in two different ways: in cohesive communities nodes link to each other, while in 2-mode communities nodes link in a bipartite fashion, where links predominate between the two partitions rather than inside them. Our method successfully detects both 2-mode and cohesive communities, which may also overlap or be hierarchically nested. Second, while most existing community detection methods treat directed edges as though they were undirected, our method accounts for edge directions and is able to identify novel and meaningful community structures in both directed and undirected networks, using data from social, biological, and ecological domains.

Comment: Published in the proceedings of WSDM '1
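The distinction between cohesive and 2-mode communities can be illustrated with a small edge count. This is only a sketch of the two connectivity patterns, not the detection method the abstract describes; the function name, the two disjoint node groups, and the toy edge list are hypothetical.

```python
def edge_block_counts(directed_edges, group_a, group_b):
    """Count directed edges inside and between two disjoint node groups.

    A cohesive pattern shows up as large 'within_*' counts; a 2-mode
    (bipartite-like) pattern shows up as large 'a_to_b' or 'b_to_a'
    counts with few edges inside either group.
    """
    a, b = set(group_a), set(group_b)
    if a & b:
        raise ValueError("this sketch assumes the two groups are disjoint")
    counts = {"within_a": 0, "within_b": 0, "a_to_b": 0, "b_to_a": 0}
    for src, dst in directed_edges:
        if src in a and dst in a:
            counts["within_a"] += 1
        elif src in b and dst in b:
            counts["within_b"] += 1
        elif src in a and dst in b:
            counts["a_to_b"] += 1
        elif src in b and dst in a:
            counts["b_to_a"] += 1
        # Edges touching nodes outside both groups are ignored.
    return counts


# Toy 2-mode pattern: nodes 1 and 2 both point at nodes 3 and 4,
# with a single edge inside the second group.
toy_edges = [(1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
toy_counts = edge_block_counts(toy_edges, {1, 2}, {3, 4})
```

Keeping `a_to_b` and `b_to_a` separate preserves edge direction, which an undirected treatment would merge into a single between-group count.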

arXiv.org e-Print Archive

CiteSeerX

Salience and Market-aware Skill Extraction for Job Targeting

Author: Guo Feng
He Qi
Shi Baoxu
Yang Jaewon
Publication date: 26/05/2020

At LinkedIn, we want to create economic opportunity for everyone in the global workforce. To make this happen, LinkedIn offers a reactive Job Search system and a proactive Jobs You May Be Interested In (JYMBII) system to match the best candidates with their dream jobs. One of the most challenging tasks in developing these systems is to properly extract important skill entities from job postings and then target members with matched attributes. In this work, we show that the commonly used text-based salience and market-agnostic skill extraction approach is sub-optimal because it only considers skill mention and ignores the salience level of a skill and its market dynamics, i.e., the market supply and demand influence on the importance of skills. To address the above drawbacks, we present \model, our deployed salience and market-aware skill extraction system. The proposed \model shows promising results in improving the online performance of job recommendation (JYMBII) (+1.92% job apply) and skill suggestions for job posters (-37% suggestion rejection rate). Lastly, we present case studies to show interesting insights that contrast the traditional skill recognition method with the proposed \model at the occupation, industry, country, and individual skill levels. Based on the above promising results, we deployed \model online to extract job targeting skills for all 20M job postings served at LinkedIn.

Comment: 9 pages, to appear in KDD202

arXiv.org e-Print Archive

Crossref

Attenuation correction for brain PET imaging using deep neural network based on dixon and ZTE MR images

Author: Fakhri Georges El
Gong Kuang
Kim Kyungsang
Li Quanzheng
Seo Youngho
Yang Jaewon
Publication venue: 'IOP Publishing'
Publication date: 24/05/2018

Positron Emission Tomography (PET) is a functional imaging modality widely used in neuroscience studies. To obtain meaningful quantitative results from PET images, attenuation correction is necessary during image reconstruction. For PET/MR hybrid systems, PET attenuation is challenging as Magnetic Resonance (MR) images do not reflect attenuation coefficients directly. To address this issue, we present deep neural network methods to derive the continuous attenuation coefficients for brain PET imaging from MR images. With only Dixon MR images as the network input, the existing U-net structure was adopted, and analysis using forty patient data sets shows it is superior to other Dixon-based methods. When both Dixon and zero echo time (ZTE) images are available, we have proposed a modified U-net structure, named GroupU-net, to efficiently make use of both Dixon and ZTE information through group convolution modules when the network goes deeper. Quantitative analysis based on fourteen real patient data sets demonstrates that both network approaches can perform better than the standard methods, and the proposed network structure can further reduce the PET quantification error compared to the U-net structure.

Comment: 15 pages, 12 figure

arXiv.org e-Print Archive

eScholarship - University of California

Effect of Time-of-Flight and Regularized Reconstructions on Quantitative Measurements and Qualitative Assessments in Newly Diagnosed Prostate Cancer With 18F-Fluorocholine Dual Time Point PET/MRI.

Author: Behr Spencer C
Flavell Robert R
Hawkins Randall A
Mollard Brett J
Seo Youngho
Yang Jaewon
Publication venue: eScholarship, University of California
Publication date: 01/01/2017

Recent technical advances in positron emission tomography/magnetic resonance imaging (PET/MRI) technology allow much improved time-of-flight (TOF) and regularized iterative reconstruction (RIR) algorithms for PET. We evaluated the effect of TOF and RIR on standardized uptake values (maximum and peak SUV [SUVmax and SUVpeak]) and their metabolic tumor volume dependencies and visual image quality for 18F-fluorocholine PET/MRI in patients with newly diagnosed prostate cancer. Fourteen patients were administered 3 MBq/kg of 18F-fluorocholine and scanned dynamically for 30 minutes. Positron emission tomography images were divided into early and late time points (1-6 minutes summed and 7-30 minutes summed). The values of the different SUVs were documented for dominant PET-avid lesions, and metabolic tumor volume was estimated using a 50% isocontour and SUV threshold of 2.5. Image quality was assessed via visual acuity scoring (VAS). We found that incorporation of TOF or RIR increased lesion SUVs. The lesion to background ratio was not improved by TOF reconstruction, while RIR improved the lesion to background ratio significantly (P < .05). The values of the different VAS were all significantly higher (P < .05) for RIR images over TOF, RIR over non-TOF, and TOF over non-TOF. In conclusion, our data indicate that TOF or RIR should be incorporated into current protocols when available.
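The two volume rules mentioned in the abstract (a 50% isocontour and an SUV threshold of 2.5) can be sketched as simple voxel counts. This is a minimal illustration under stated assumptions, not the study's actual pipeline: the lesion is given as a flat list of voxel SUVs, the two rules are treated as alternatives, voxels at exactly the cutoff are included, and the function name, voxel size, and sample values are hypothetical.

```python
def metabolic_tumor_volume(suv_values, voxel_volume_ml, *,
                           fixed_threshold=None, isocontour_fraction=None):
    """Estimate lesion volume in mL by counting voxels at or above a cutoff.

    Exactly one rule must be chosen:
      fixed_threshold      absolute SUV cutoff (for example 2.5)
      isocontour_fraction  fraction of the lesion's SUVmax (for example 0.5)
    """
    if (fixed_threshold is None) == (isocontour_fraction is None):
        raise ValueError("choose exactly one of fixed_threshold or isocontour_fraction")
    if not suv_values:
        return 0.0
    if fixed_threshold is not None:
        cutoff = fixed_threshold
    else:
        cutoff = isocontour_fraction * max(suv_values)  # relative to SUVmax
    voxels_above = sum(1 for value in suv_values if value >= cutoff)
    return voxels_above * voxel_volume_ml


# Hypothetical lesion voxels (SUV) and voxel size, for illustration only.
lesion_suv = [1.0, 2.6, 3.0, 4.0, 6.0]
voxel_ml = 0.5
volume_fixed = metabolic_tumor_volume(lesion_suv, voxel_ml, fixed_threshold=2.5)
volume_isocontour = metabolic_tumor_volume(lesion_suv, voxel_ml, isocontour_fraction=0.5)
```

Because the isocontour cutoff scales with SUVmax, any reconstruction that raises the peak value also moves that cutoff, which is one way measured volume can depend on the reconstruction algorithm.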

eScholarship - University of California